perm filename THESIS.MSS[RDG,DBL] blob sn#658717 filedate 1982-05-07 generic text, type C, neo UTF8
@INCLUDE(OVERHE.MSS)
@BEGIN(TitlePage)
@BEGIN(TitleBox)
Using Analogies For Knowledge Acquisition

Russell Greiner
@END(TitleBox)

@END(TitlePage)
@Chapter[Introduction]
@Label<Intro>

Analogy plays a role in many of the reasoning processes which
people use every day.
Analogical reasoning is required to understand metaphors 
(as in "she's a packrat") and
similes (like "he eats like a bird"),
as well as many forms of explanations --
when one phenomenon is described in terms of 
(that is, using the model defined by) another,
as seen in the common
"electricity can be understood using the model of water flow" analogy,
or "consider the atom to be a small solar system".
Much of our speech is laced with "extended" meanings,
based on metaphors we had not realized we were using.
We, for example, consider time a precious commodity,
("he didn't have the time to give"),
indicate emotional makeup in terms of spatial (up versus down) position,
("he was feeling down this morning")
and even consider speech to be "laced".

Such evidence supports the view that the ability to generate and comprehend
an analogy is a tremendously useful and powerful tool
-- not only for allowing efficient communication,
but also for increasing our "understanding" of parts of the world.  

This research has two goals.
We will first present a semi-formal cognitive model which
explains much of the phenomenon of analogy.
The @Cite[WhatAnalogy] paper is a preliminary step in that direction.
This proposal will concentrate on the second goal:
implementing a computer program which uses 
(a subset of) this analogical capability.
In particular,
this program will permit a domain expert to use analogies when describing
terms (to an expert system),
both to define new terms, and to further specify incompletely described terms.

@Chapter(Definition of Analogy)
This research begins with a definition of (our usage of) the term "analogy",
and a mechanism for deciding how good or appropriate an analogy is.

We are defining an analogy as a shared partial theory.
That is, two objects are analogous if they each satisfy the same
common partial theory, or cpt. (See @Cite(M&M).)
In the example which follows, the cpt is <@i(Hierarchy)>, the theory of
hierarchies.@FOOT{
This definition has much in common with both
@Cite[Hesse]'s "formal analogies",
and the notion of "ground" used in @Cite[Paivio] (in the context of
"topic, vehicle and ground").}

This definition differs from the more standard "mapping approach," used
by most researchers -- including 
@Cite(Winston), @Cite(Carbonella), @Cite(Kling), @Cite(RBrown), 
@Cite(Gentner), @Cite(Hesse) (and, from learning, @Cite(Learn-HR)).
There, analogy is considered a mapping,
connecting some attributes of one analogue to corresponding attributes of the other.
As Subsection @Ref(UseOfAnswer) points out, corresponding attributes
of the analogues can be associated using the "common subtheory" approach as well.
The (easier to describe) feature matching method does work quite well
when the only correspondences are at a "surface" level -- 
@i{e.g.}, when the relevant properties to match
are things like number of deaths, or particular steps in a proof.
It runs into quite a bit of trouble mapping "deeper structures"@Foot{
While related, this should NOT be confused with the psycho-linguistic usage
of "deep structure".  We are NOT pretending to have captured some deep semantic
information, either by virtue of using predicate calculus terms, or because
of the ease with which the system can generate new relations.}
-- @i{e.g.}, facts not readily apparent in the attribute-value pairs of some
aspect of the analogues.  A good example of this was the "embedded" hierarchy
common to both types of trees.
Unless the descriptions of both types of trees already happened to have
this specification, (@i{e.g.}, as appears in @Ref(Use-Hier),)
it is hard to see how (a naive)
feature matching could have noticed, much less discovered, this relation.@FOOT{
Indeed, I am prepared to claim that
this common subtheory approach will in fact subsume this other model of analogy.
However, the purpose of this proposal is not to compare these
approaches @i(per se),
but rather to demonstrate consideration of other attempts to deal with
this problem of analogy.  Much more will be said in the thesis itself.}

Neither of these definitions, by itself, is terribly useful:
notice that neither tells why a given analogy is proposed,
rather than any of the others which are still acceptable by their respective
definitions.
What makes a given analogy appropriate?
The criteria involved in this decision are quite subjective --
different people will propose quite different analogies to answer the
same question.
(There is no single answer to the question "Who is England's first lady?"
-- based on gender, many would insist it has to be Margaret Thatcher,
the female prime minister,
while others, seeing the "Spouse of the Head of State" role as more important,
would argue for her spouse, Denis Thatcher.
With similar arguments one could justify
Queen Elizabeth, or Prince Philip, or ... See @Cite[Hof81].)
Furthermore, even a single person may propose a variety of different
analogies depending on the current context:
A spoon is rather like a shovel in terms of its PTRANS-ing function,
whereas it is clearly closer to a knife in terms of size and 
general purpose.

Despite this, the other analogizing systems mentioned above 
incorporated a single corpus of rules which collectively defined 
how to generate (and/or evaluate) an analogy.
These rigidly defined what the program could produce.
One of the goals of our program, however, is flexibility --
to be able to accommodate different users, in diverse situations.
We achieve this versatility by employing
a collection of heuristics to guide the program's decisions.
The user will be able to modify these rules to tune the program to
his idiosyncrasies.
In fact, the overall system will 
include a module to help the user to adjust that body of heuristics,
honing it to produce analogies which fit his specifications.
We will return to this point later.

The other major plus is  ... Reformulation
The I/O of the components is shown below in Figure @Ref(Components).

@BEGIN(Figure)

@BEGIN(Center)
@BEGIN[Verbatim]
						/-----------\
	   +---------------> CONTEXT		|  DOMAIN   |
	   |			|		|   FACTS   |
USER - - > |			|		|	    |
	   |			@Z[↑]		    \-----------/
	   |		|---------------|
	   @Z[↑]            |		    |
	INQUIRY	 ---->	|   @i[Analogizer]     |  --->  ANSWER
			|		|	   
			|---------------|	/----------\
				↑		|   META   |
				|		|   FACTS  |
			     /------------\     \----------/
 /	  |-------------| \  |		  |
(USER --> | @i(KB Modifier)   |->) | HEURISTICS |
 \	  |-------------| /  |            |
			     \------------/
@END[Verbatim]
@END(Center)

@CAPTION(Modules)
@TAG(Components)
@END(Figure)
The rest of this report will elaborate on this figure.
The only "required reading" is Section @Ref(NewIdeas),
which discusses, at a very high level, the basic approach we are taking,
emphasizing how this differs from other AI systems,
and why we think this will exhibit impressive behaviour.
Each of the subsequent sections has a less general goal,
and should be read only if the reader wants an answer to the
particular question that section addresses.

The purpose of Section @Ref(Tree-Example) is to convince the reader of
the feasibility of this approach.  
(It presents a representative problem,
illustrating how the individual modules will interact to answer the inquiry.)
Section @Ref(Salient) elaborates on the sketch of the analogizer one might
infer from the example.  
This is followed by a description of the scope of this project,
in Section @Ref(Scope).
Here we provide our definition of analogy,
and then list a small collection of instances of analogy
which our system will (eventually) be able to handle.
The concluding Section @Ref(Programme) will outline the research programme
I intend to follow,
and mention the goals (and anticipated caveats and pitfalls) of this work.

@Chapter(New Ideas)
@Label<NewIdeas>
Below is a list of the ways in which this work on analogy
differs from previous AI work in this area.

@BEGIN(Itemize)
Definition of analogy@*
Rather than view an analogy as a direct mapping between corresponding aspects 
of the analogues, we take a model-theoretic approach -- defining
an analogy as a shared partial theory.
That is, two objects are considered analogous if they each satisfy the
same theory.
This approach readily accommodates reformulation,
which we consider essential to any analogy-understanding task.

Our approach@*
We are taking an "Expert System" approach to the problem of analogy,
(with a few minor twists).
Expert systems are characterized as knowledge intensive systems, 
containing a large "knowledge base" of facts about the domain.
Here, that domain is analogy rather than, say, chemistry; and
this information is in the form of rules or heuristics.  
Part of Subsection @Ref(KB-Stuff) discusses the types of rules which are required.
(This approach is contrasted with the procedural attitude, in which
some inflexible, hand-crafted function is used to find the analogies.)

Individual analogies@*
This knowledge base will be easy to modify,
to adapt to the biases of individual users.  
This can be compared with other AI systems,
which can find (at most) those analogies which their designers considered worthwhile.
(Note this does NOT mean each user is forced to enter his particular rules.
He may, if he wishes, simply use my set of rules, which result in analogies
I consider reasonable.
The point is that he has the option of altering these rules,
to change the behaviour of the analogizer.
In fact, we even supply tools which facilitate this task.)@Foot{
As a powerful "side-effect" of both above points,
the analogical "matches" this
system will be able to find will not be restricted to
just those connections which are based on relations the user happened to include
in his description of the analogues.
Our system will be able to extend this language in simple ways 
(@i{i.e.}, to include derivable terms)
which can also be used for this match; and exploited afterward.}

@BEGIN(Multiple)
"Scholarly" research@*
Much AI work is built in an @i(ad hoc) manner,
to solve some small, particular job.  
The result is often of little use, beyond a crude existence proof that this
particular program could be built
-- @i{i.e.}, the "simple" scaling up usually renders the system unusable.
Our goal, here, is a system which can be used
by many different people, over a variety of different domains of inquiry.

To have any chance of success,
it is important to first
"understand" the general idea of analogy, 
(as opposed to simply one man's version of this phenomenon,)
before beginning the implementation.
We will, therefore, begin this research effort 
with a literature search which will 
cover work in psychology and philosophy as well as AI,
and will conduct numerous preliminary dry labs.
We hope to glean the useful characteristics of analogies in this process,
which can be incorporated into the new system.
If (wildly) successful,
the result of this research will be of general interest,
especially to people who theoretically examine such issues,
(such as philosophers of science).
We will, however, be quite happy if these derived heuristics
simply prove to be a good starting set for the Heuristics KB;
and lead to the design of an efficient and powerful Analogizing algorithm.
Perhaps then this system will avoid the obsolescence which has characterized
all prior analogizing systems.
@END(Multiple)
@END(Itemize)

@Chapter(Example - "Tree")
@Label<Tree-Example>
This particular example should both motivate why various components 
(shown in Figure @Ref(Components))
are included, and illustrate how they will work.
For pedagogic reasons,
the first pass of this description
will trivialize or even ignore certain aspects of this system.
It will be followed by a corrected (well, less totally fallacious) version.
If possible, try to suspend your disbelief until the end of this chapter.
Subsequent 
sections
will discuss this analogizer in more detail,
provide a collection of additional examples,
and discuss the actual research programme I intend to follow.

@i(Note: I'm currently in the process of developing another, more
meaningful example to replace this strained and useless one.
Feel free to glance at the rest of this section...)

@Comment<
Use another example - perhaps from programming:
Given a program which sorts, using a linked list, construct one which
sorts... >

Computer scientists use the word "tree" to
refer to a certain type of data structure.
Clearly this formal description is not the biological entity
which "tree" originally meant.
Computer science has borrowed certain particular aspects of "tree"ness
in its usage of this term, and only these.

We clearly think of these CS trees (CS-trees) as analogous
to natural,  biological trees (N-Trees).
But how?  Which properties should carry over?
And why those and not others?

@Section(I/O Behaviour)
This is a question the analogizer can try to answer.
The INQUIRY would be a pair of theories,
describing CS-trees and N-Trees respectively.
The question implicitly asked is "How are they related?".

Informally, each CS-Tree is defined in terms of a set of nodes, Nodes,
and a binary relation, LinkedTo.  
The constant Root is a special member
of this Nodes set, defined as its maximal element
with respect to the LeadsTo relation, 
where LeadsTo is the transitive closure@FOOT{
Here TransitiveClosure is defined as the union of one [NOT zero] or
more applications of the "function" -- here, we would say
@BEGIN[Equation]
(@Z(A) x,y. (LinkedTo x y) @Z(g) (LeadsTo x y))
(@Z(A) x,y,z. ((LeadsTo x y) & (LeadsTo y z)) @Z(g) (LeadsTo x z)).
@END[Equation]}
of LinkedTo.
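The transitive-closure construction just described can be sketched in modern code. The following Python fragment is purely illustrative -- the encoding of relations as sets of pairs, and the node names, are assumptions for this sketch, not the system's actual representation. It follows the footnote's definition (one or more applications, never zero):

```python
# Illustrative sketch: a binary relation is a set of (x, y) pairs.
# transitive_closure returns the union of ONE or more compositions,
# matching the footnote's definition (LinkedTo itself is included).
def transitive_closure(relation):
    closure = set(relation)
    changed = True
    while changed:                       # iterate to a fixed point
        changed = False
        for (x, y) in list(closure):
            for (y2, z) in list(closure):
                if y == y2 and (x, z) not in closure:
                    closure.add((x, z))
                    changed = True
    return closure

# Hypothetical CS-Tree: every node eventually LeadsTo Root.
linked_to = {("B", "A"), ("C", "A"), ("A", "Root")}
leads_to = transitive_closure(linked_to)
assert ("B", "Root") in leads_to         # derived, not directly linked
```

Note that Root is then maximal with respect to leads_to, exactly as the Facts clause requires.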

This CS-tree theory is stored as
@BEGIN[Equation]
CS-Tree
  Isa:		Mathematical Structure
  Domain:	Nodes
  Relations:	{LinkedTo}
  Functions:	{}
  Constants:	{Root}
  Facts:	{(@Z(E) LeadsTo. (TransitiveClosure LinkedTo) = LeadsTo  & 
			(AntiReflexive LeadsTo) & (PartialFunction LeadsTo) &
			(@Z(A) x. ((x @Z(B) Nodes) & (x @Z(=) Root)) @~
@Z(g) (LeadsTo x Root)))}
@Comment{ or maybe just "the LeadsTo relation is quite important"}
  Refinements:	{BinaryTrees, BalancedTrees, 2-3Trees, ...}
  UsedInField:	{ComputerScience}
  UsedFor:	{SortingRoutines, ...}
  RefersTo:	{DataStructure}

@End[Equation]
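To make the slot structure of this quasi-frame concrete, here is a hypothetical rendering as ordinary keyed data in Python. The clause strings merely stand in for the predicate calculus facts; none of this is the system's actual storage format:

```python
# A hypothetical rendering of the CS-Tree frame as a Python dict.
# Binary-relation facts (Isa, UsedInField, ...) live in slots;
# other facts stay as predicate-calculus clauses (strings here).
cs_tree = {
    "Isa":         "MathematicalStructure",
    "Domain":      "Nodes",
    "Relations":   ["LinkedTo"],
    "Constants":   ["Root"],
    "Facts":       ["(= (TransitiveClosure LinkedTo) LeadsTo)",
                    "(AntiReflexive LeadsTo)",
                    "(PartialFunction LeadsTo)"],
    "Refinements": ["BinaryTrees", "BalancedTrees", "2-3Trees"],
    "UsedInField": ["ComputerScience"],
}
# Slot access is then an ordinary lookup:
assert "LinkedTo" in cs_tree["Relations"]
```

Compartmentalizing facts this way is what lets meta-level annotations (salience, likelihood of matching) attach to individual entries.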
@BEGIN(Comment)
(@Z(A) csTree. (csTree @Z(B) (Extension CS-Tree)) @Z(g)
	(@Z(E) Nodes, LinkedTo, Root. (Uses csTree Nodes) & (Nodes @Z(B) SET) &
		(Uses csTree LinkedTo) & (BinaryRelation LinkedTo) &
		(Root @Z(B) Nodes) &
		(@Z(E) LeadsTo. (TransitiveClosure LinkedTo) = LeadsTo  & 
			(AntiReflexive LeadsTo) & (PartialFunction LeadsTo) &
			(@Z(A) x. ((x @Z(B) Nodes) & (x @Z(=) Root)) @~
@Z(g) (LeadsTo x Root)))))

(CS-Tree @Z(B) DataStructure) & (UsedInField CS-Tree ComputerScience) &
(UsedFor CS-Tree SortingRoutines) & ...

(SubTypes CS-Tree {BinaryTrees, 2-3Trees, BalancedTrees, ...})
@END(Comment)

@Comment{  <<HERE>>
The actual "Inquiry" part of the input is ...
In addition,  ... context ...
The result is ...}

We'll now discuss a few nuances of this I/O specification.
First, predicate calculus fanatics should not be disappointed to see this
represented using a quasi-frame system.  Slots are used to store (only) those
specific facts which in fact use a binary relation, such as Domain or UsedInField.
Other information, in particular relating to the characteristics of the
LeadsTo relation, are encoded in the more natural predicate calculus clause format.
By "compartmentalizing" this information we can then associate meta-level
facts with individual clauses 
-- it is easy, for example, to indicate how important a given fact is to the essence
of this concept, or how likely it is to be matched.
We feel this information will play a major role when searching for analogies.
(For example, this is needed to convey facts about the salience of certain
properties.  See @Cite[Ortony1].)

Second, the terms CS-Tree, BinaryTree, 2-3Tree, @i(et cetera),
refer to the @i(concept) of these trees.
Their respective extensions will each be the set of all such trees.
Finally, many of the conjuncts in the FACTs above are second-order.
While many could be expressed in first-order,
second-order forms are used because they seem
more perspicuous --
each represents a more succinct and natural way of expressing some fact
about some relation.@Foot{
For example,
the previous footnote gives another expression for
(TransitiveClosure LinkedTo) = LeadsTo.
These first- to second-order correspondences are not lost -- they
are stored in the Meta FACTS knowledge base. (See
Subsection @Ref(KB-Stuff).)}

Of course any specific type of CS-Tree will have other facts: for example,
some characterization of its "branchiness" or depth, or a description of
the other (non-pointer) contents of its nodes.  
However, as this is a description of the most general possible CS-Tree,
it intentionally omits all such refinements/further specifications.

The domain-data section of the Domain FACTS Knowledge Base
(again, Subsection @Ref(KB-Stuff) describes this in detail)
similarly stores a body of facts about each biological tree.
Among other things, each N-Tree has branches, leaves, and a
single trunk; many branches may spring from this trunk, and
each branch may then sprout further branches.@FOOT{
There may be no such elegant theory for natural trees.
Instead we may have simply a collection of facts about certain
particular trees.  From this we would have to induce those general
characteristics.  Just how this would be done is in my "worry about
later if I have to" category.}
(There are, of course, a host of other facts, about the bark and its color,
its seasonality, its fruit, and various habitats.  
We will ignore these for now; and 
will discuss later (in Subsection @Ref(Context))
how the program also "knew" to ignore them.)

Pointers to these two theories constitute the "INQUIRY" input to the analogizer.
The analogizer's output will be a "subtheory" which is common to both of these 
proposed analogues,
together with a pair of "mappings" (or instantiations)
from this common partial theory to each of those inputs.

Based on the description provided,
one possible common partial theory is a (theory of a) hierarchy.
That is, both CS- and N- Trees are embodiments of hierarchies,
built from nodes in the first case, and from branches in the other.
The challenge is to find that common part from the information provided.

@Section(Internal Process)
@Label<Internal>
This underlying analogizing process is based on a weak, general feature matcher.
The various decisions -- in particular what to match, and how to evaluate
the "goodness" of a match -- will be based on the heuristics stored in
the Heuristics Knowledge Base,
(which have been input by the individual users).
For example, when comparing two structures (that is, two theories)
one obvious rule is to search for similarities between relations of the
same arity.
This CompareSimilarRelations heuristic 
(when included in the HEURISTICS Knowledge Base)
instructs the analogizer to compare the binary relations 
LinkedTo and SproutsInto.
The comparision would be based on
the presence or absence of certain "transferable" properties of the
relations in question.  This would note that both of these two relations 
@BEGIN[Equation]
@TAG(R-Defn)
@BEGIN[Enumerate, Spread 0]
applied to a pair of members of the same space@*
(nodes in one case, branches in the other)

were anti-reflexive@*
(of course this is only meaningful when (at least a weak case of) 1. holds)
@END[Enumerate]
@END[Equation]

These properties imply (CompareSimilarRelations tells us)
that both of these relations have a meaningful transitive closure.
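The core of the CompareSimilarRelations idea can be sketched as follows. This Python fragment is an illustration only: the property names and the encoding of a relation as an (arity, property-set) pair are assumptions made for the sketch, not the heuristic's actual form:

```python
# Sketch of CompareSimilarRelations: pair up relations of equal
# arity and score each pairing by its shared "transferable"
# properties.  Property names here are illustrative.
def compare_similar_relations(rels_a, rels_b):
    matches = []
    for name_a, (arity_a, props_a) in rels_a.items():
        for name_b, (arity_b, props_b) in rels_b.items():
            if arity_a == arity_b:           # only compare like arities
                shared = props_a & props_b
                matches.append((name_a, name_b, shared))
    # prefer the pairing with the most shared properties
    return max(matches, key=lambda m: len(m[2]))

cs = {"LinkedTo":    (2, {"anti-reflexive", "same-space-args"})}
n  = {"SproutsInto": (2, {"anti-reflexive", "same-space-args"})}
best = compare_similar_relations(cs, n)
assert best[:2] == ("LinkedTo", "SproutsInto")
```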

Another rule, CreateRelevantNewRelations,
would propose creating and examining any new relation(s)
suggested during the investigation of another relation.
It would then (create and) probe the
LeadsTo and SproutsInto*@FOOT{SproutsInto* = (TransitiveClosure SproutsInto).}
relations.
By deduction or inspection, it would see that each of these
@BEGIN(Equation)
@TAG[Hier-Defn]
@BEGIN[Enumerate, Spread 0]
is non-circular (admits no loops)

has a single maximal element.
@END[Enumerate]
@END(Equation)

The above facts show clearly that
CS-Trees and N-Trees share some interesting properties.
Is this enough to justify calling them analogous?
That is, are we done now? 
The answer is "maybe".
The decision that a meaningful analogy has been achieved,
like all the other decisions mentioned thus far,
is quite subjective.
Recall we noted earlier that each type of tree 
included a set of elements and a binary relation.
The fact that we did not even consider stopping then
shows we did not feel that this was sufficient.
(Or at least we felt the search should continue for additional commonalities.)
We expect any artificial analogizer to do as much.

Here we get to a philosophical point which will be repeated several more times
in this proposal.
Given any question,
such as deciding when to accept a bundle of facts as a legitimate analogy,
there are two distinct approaches to determine the answer.
The computer program
can either "hardwire" in some criteria, or leave it under user control.
As with the rest of this analogizing process,
we chose the latter alternative.
The analogizer will terminate when some heuristic says it should.
These heuristics, like the others, will be subject to user modifications.

One possible terminating-heuristic, InterestingGeneratedRelation,
would see if some indirectly generated relation proved "interesting".
This, in turn, would be true if
a pair of non-trivial properties were discovered,
such as those mentioned above in @Ref(Hier-Defn).
Here the analogizer would be told that this alone is sufficient to
induce a viable analogy.

(There are, of course, other possible termination rules.
Another heuristic may insist we
continue searching for commonalities until 
some bang-for-buck threshold had fallen too low.  
Another approach would involve a large, auxiliary knowledge base of known "answers".
[Some of MRG's stuff is not unrelated to this.]
When some rule noted that this pair of facts (given in @Ref(Hier-Defn))
constituted the known mathematical construct, partial ordering,
it would terminate the search,
explaining that this commonality alone is sufficient to justify an analogy.)

Another (meta-)statement of philosophy:
at this point in the research,
no one knows what approach will give the best results in general -- 
it might be one of these, or some other totally different one.
One goal of this research is to provide a testbed in which to
test one approach against another, hoping to demonstrate 
their relative advantages, or, perhaps, to motivate
the synthesis of some new tactic.  More on this will appear later.

By whatever process, let us assume the analogizer decides that it has
found a good candidate for an analogy.  
It now has to communicate this finding.
This answer should clearly include the commonality found --
in this case, a description of a hierarchy.
In addition the answer should describe how this hierarchy is embedded in each of
the two models
-- @i{i.e.}, what parts of the CS-tree (respectively N-Trees) form this
partial ordering.

Therefore the answer to the "CS-tree to N-Tree" inquiry will be 
<@i(Hierarchy)>, 
that hierarchy partial theory, together with a pair of alists,
to tie this description to each of the inputs (see Figure @Ref(Answer-Fig)).
Each alist will associate each constant, function and relation (symbol)
in that theory of hierarchies
with its "definition" in its description of a CS-tree (respectively N-Tree).
Hence the Maximal constant would be connected with the Trunk constant
in the N-Tree language, and with Root for CS-Trees.
Note the R2* relation maps
onto the non-atomic (TransitiveClosure SproutsInto) for N-Trees
(and the atomic LeadsTo for CS-Trees).

@BEGIN(FIGURE)
@BEGIN(Verbatim, Spread=0)
{<@i(Hierarchy)>,
 (<Structure CS-Tree> <Domain    Nodes> <R2    LinkedTo> 
	<Maximal  Root>	<R2* 			    LeadsTo>)
 (<Structure  N-Tree> <Domain Branches> <R2 SproutsInto>
	<Maximal Trunk> <R2* (TransitiveClosure SproutsInto)>)}
@END(Verbatim)
Where <@i(Hierarchy)>, the theory of hierarchies, is
@BEGIN[Equation]
(@Z(A) Instance. (Instance @Z(B) Structure) @Z(g)
	(@Z(E) Domain, R2, Maximal, R2*. (Uses Instance Domain) & (Domain @Z(B) SET) &
		(Uses Instance R2) & (BinaryRelation R2) & (Maximal @Z(B) Domain) &
		(TransitiveClosure R2) = R2*  & 
		(AntiReflexive R2*) & (PartialFunction R2*) &
		(@Z(A) x. ((x @Z(B) Domain) & (x @Z(=) Maximal)) @~
@Z(g) (R2* x Maximal))))
@End[Equation]

@CAPTION(The answer returned to "CS-Tree to N-Tree" Inquiry)
@TAG(Answer-Fig)
@END(Figure)
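The shape of this answer -- a common partial theory plus one association list per analogue -- can be shown as plain data. The Python rendering below is illustrative; the symbol names are those of the figure, but the data layout is an assumption of this sketch:

```python
# The answer structure as Python data: the shared partial theory's
# name, plus one association list (here a dict) per analogue.
answer = {
    "cpt": "Hierarchy",
    "bindings": {
        "CS-Tree": {"Domain": "Nodes",    "R2": "LinkedTo",
                    "Maximal": "Root",    "R2*": "LeadsTo"},
        "N-Tree":  {"Domain": "Branches", "R2": "SproutsInto",
                    "Maximal": "Trunk",
                    # non-atomic definition for the N-Tree side
                    "R2*": ("TransitiveClosure", "SproutsInto")},
    },
}
# Corresponding attributes fall out of the shared symbols:
assert answer["bindings"]["CS-Tree"]["Maximal"] == "Root"
assert answer["bindings"]["N-Tree"]["Maximal"] == "Trunk"
```

This makes explicit how the "common subtheory" approach also yields the attribute correspondences that the mapping approach takes as primary.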
@Section(Other inputs)
@Subsubsection(Context)
@Label(Context)
While everything shown above was fairly straightforward,
it seems rather unmotivated; and, as it stands, it should be equally unconvincing.
How did we know to compare the "structure" of these two types of
trees?  Why not contents, 
(@i{e.g.}, Pointers or Integers, versus Squirrels or Nests,)
function, (@i{e.g.}, "to block sunlight" or "to facilitate bifurcation sorting",)
or cutenesses (@i{e.g.}, both appear in environments in which there are bugs)?

Nothing shown above answers that question.
Each of these suggestions constitutes (or could lead to)
a reasonable analogy;
and we have yet to give any reason capable
of eliminating any of them.
Notice that each of these responses is appropriate in some context.
A comedian, for example, might attempt to produce a "strained" analogy
based on some "irrelevant", superficial aspect shared by both types of trees.
In noticing the common name "bug", he might decide this was the criterion
which connected these trees.

The task now is to constrain the analogizer to generate only those analogies
appropriate for a particular situation.
This is what the CONTEXT input is used for.@FOOT{
One approach is to specify what sort of question this analogy is
meant to answer --
@i{i.e.}, telling the analogizer why we asked this question in the first place.
The other types of values for this context parameter are still a research
question.}

Note that CONTEXT is the other "dynamic" input to the analogizer shown in
Figure @Ref(Components).
(By dynamic we mean it is subject to change with every query.
Both the Heuristics and both Facts databases,
on the other hand, are more static.
The user's only interaction will be during a (possible) initial tuning phase;
after that he will not need to change it.)
The context is used to constrain the space of possible analogies the 
analogizer will propose,
often down to a single mapping.

To indicate the power and generality of this context mechanism,
we will present two different types of context here,
both of which happen to lead to the same 
"consider both trees as hierarchies" conclusion.
This first context is (something like)
"to explain the `height' (or `depth') of a CS-tree",
and the second is "generate several, and return the best".

Height for N-Trees is quite straightforward and well-defined:
its height is the distance from the ground on which the tree is rooted,
up to its highest branch.@FOOT<
Formally, we have
@BEGIN[EQUATION]
@TAG(Height-Defn)
Height(tree) = @i(max)@-[{p @Z(B) paths(tree) }]@i(sum)@-[{b @Z(B) p}] Delta-Height(b)
@END[EQUATION]
where paths(tree) is the set of all paths of connected branches
beginning at the trunk, connected by the SproutsInto relation,
and Delta-Height(b) is the change of the y-coordinate of the branch b, from
its origin (where it "sprouted" from another branch, 
or the ground for the tree trunk) to its other end.>
Notice it is independent of the occupants of the tree, or its color or purpose.
It depends only on the connectivity of the branches, their relative orientation,
and their respective lengths.
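The height equation of the footnote can be computed directly: enumerate the root-to-tip paths induced by SproutsInto, sum each branch's Delta-Height along a path, and take the maximum. In the Python sketch below the tree encoding and branch names are assumptions made for illustration:

```python
# Height(tree) = max over paths of the summed Delta-Heights.
# sprouts_into: set of (parent, child) branch pairs;
# delta_height: each branch's change in y-coordinate.
def height(sprouts_into, delta_height, branch="Trunk"):
    children = [c for (p, c) in sprouts_into if p == branch]
    if not children:                      # a path ends at this branch
        return delta_height[branch]
    return delta_height[branch] + max(
        height(sprouts_into, delta_height, c) for c in children)

sprouts = {("Trunk", "B1"), ("Trunk", "B2"), ("B1", "B3")}
dh = {"Trunk": 3.0, "B1": 2.0, "B2": 1.0, "B3": 0.5}
assert height(sprouts, dh) == 5.5         # path Trunk -> B1 -> B3
```

Observe that only the connectivity (SproutsInto) and the per-branch lengths enter the computation, which is precisely why this context focuses attention on structure.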

The definition of paths(tree) shows how critical the structure of the tree
is in this calculation.  
(In fact, the only other characterization of the tree is the length and angle
of its various branches.)
This fact is what focussed our (and hence the analogizer's) attention on
the structure of the N-Tree -- and in particular, on the SproutsInto relation.
Examining those sentences which deal with this relation leads to the "answer"
generated.

While fairly direct, this particular context does seem rather unintuitive --
what aspect of computer science would suggest looking for the height of
an object?
Hence consider the other suggested type of context.
Here the analogizer will go about proposing a few "reasonable" analogies --
ones which follow from the current body of heuristics within some threshold
of immediancy --
determined, perhaps, by bounding the number of heuristics fired, 
or by restricting the CPU time allocated, or the number of CONS cells which
may be used, or some weighted average of these.
These analogies would then be evaluated, and the best candidate would be returned.
This ranking (of the proposed analogies) will be based on heuristics which
the user can alter, of course.
With heuristics like StructuredRelationsAreImportant (see @Cite(Gentner))
pulling for this shared hierarchy analogy,
it is quite likely this analogy would be the one selected.
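The "generate several, and return the best" context can be sketched as a bounded generate-and-rank loop. This Python fragment is illustrative only: the resource bound here is a simple firing count, and the scoring rule stands in for user-alterable heuristics such as StructuredRelationsAreImportant:

```python
# Sketch of "generate several, return the best": propose analogies
# until a resource bound (a firing count here) is exhausted, then
# rank the candidates with user-alterable scoring heuristics.
def generate_and_rank(propose, score_rules, max_firings=10):
    candidates = []
    for _ in range(max_firings):
        candidate = propose()
        if candidate is None:            # generator exhausted
            break
        candidates.append(candidate)
    # each scoring heuristic contributes to a candidate's total
    return max(candidates,
               key=lambda c: sum(rule(c) for rule in score_rules))

analogies = iter([{"cpt": "SharedBugs"}, {"cpt": "Hierarchy"}])
propose = lambda: next(analogies, None)
structural = lambda c: 10 if c["cpt"] == "Hierarchy" else 1
assert generate_and_rank(propose, [structural])["cpt"] == "Hierarchy"
```

Other bounds mentioned in the text (CPU time, CONS cells, weighted averages) would simply replace the firing count.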

It is worth reiterating that, as with everything else,
the part of the analogizer which knows how to use these contexts
is NOT built in, but is readily augmentable.
That is, there are certainly other types of context, beyond the two
presented.@FOOT{
One major use of analogies is in theory (not to mention thesis) formation.
(See @Cite(Darden), etc.) 
Finding a connection between the analogues is not enough for this problem --
the goal is to extend such a mapping to find "new" facts about one analogue
based on some corresponding fact in the other.
The context is responsible for communicating this emphasis,
on finding "open" and extendable connections rather than
more closed and complete mappings, to the analogizer.
We are currently trying to understand and formalize this distinction,
rendering it usable.
(This will begin by extending @Cite(Hesse)'s early work.)}
(Creating a taxonomy of these methods, and ways of using them, is another
major research area.)

@B(How the CONTEXT is used)@*
Let's return to the first type of context, to examine it in more detail.
This "height" context really contained two chunks of information:
First was the statement that certain N-Tree facts should be considered
more important than the others, and analyzed first.
The second part was a specification of these N-Tree facts 
-- @i(viz.),
those which have something to do with the height of such trees.

That "have something to do with" phrase was intentionally vague.
Once again a set of heuristics is used to determine just what it "means"
-- in this case, what it means for a fact to "have something to do with" height.
Several of the heuristics in the data base may deal with this issue.
One such rule is FindRelevantFactorsFromContext,
which tells the analogizer to scan through the information
supplied in the context to find which relations, constants,
@i(et cetera) were explicitly
mentioned (e.g., Paths, Length, Summation), and which were necessary for these
(e.g., SproutsInto, as it pertains to Paths).
This information is used to rank the various possible perspectives@Foot{
Certain relations seem to fall naturally into the same "category".
A collection of sentences which use only a single category of relations
is considered a single perspective.}
of the tree.
Only the highest-ranked one, which deals with structure, is pursued.
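The ranking performed under this rule can be sketched concretely. The following Python fragment is a minimal sketch, with all names and data invented for illustration (nothing here is the system's actual representation): it scores each perspective by how many of its relations the context mentions, directly or as prerequisites, and picks the best.

```python
# A minimal sketch (all names and data hypothetical) of the
# FindRelevantFactorsFromContext idea: rank each "perspective" -- a bundle
# of sentences using a single category of relations -- by how many of its
# relations the context mentions, and pursue only the best-ranked one.

# Relations the context mentions explicitly, plus those necessary for them.
CONTEXT_RELATIONS = {"Paths", "Length", "Summation", "SproutsInto"}

# Each perspective of N-Tree, with the relations its sentences use.
PERSPECTIVES = {
    "structure": {"SproutsInto", "Paths", "Length"},
    "botany":    {"Photosynthesis", "Leaf", "Bark"},
    "habitat":   {"LivesIn", "Squirrel"},
}

def rank_perspectives(perspectives, context_relations):
    """Return perspective names sorted by context overlap, best first."""
    return sorted(perspectives,
                  key=lambda name: len(perspectives[name] & context_relations),
                  reverse=True)

best = rank_perspectives(PERSPECTIVES, CONTEXT_RELATIONS)[0]
```

Note that a different context (say, one mentioning Squirrel) would rank a different perspective first, which is exactly the context-sensitivity argued for above.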

@B(Four final comments about context)@*
First, just what does the context input contain,
and how does that differ from the inquiry and from the Heuristics KB?
The inquiry proper is a pair of proposed analogues, nothing more.
(Each will usually be encoded as a pointer to some statement in the
Domain Facts KB, by the way.)
Anything else which might be pertinent to finding the analogy --
such as the assumed role of this analogy, or the current "speaker" (i.e., the author
of this analogy) -- belongs in the context.
This context contains FACTUAL information, as opposed to heuristics.

Second, it is up to the analogizer to decide how to use this information --
the context itself says nothing in this regard.
Notice it was at the urging of the FindRelevantFactorsFromContext rule
that the context was scanned,
looking for information which would narrow the analogizer's search.
(Here, the nature of the Height function told the analogizer to consider
only sentences which deal with structural information.)
Without this rule in the Heuristics KB, this part of the context
(stating that the purpose of this analogy was to explain the `height' of a
CS-Tree) might not have been used; and possibly a different analogy would
have been generated.

A third minor comment addresses the language used for entering the context.
The term "height", given by itself, is open to many interpretations:
It might, for example, refer to the size of its leaves rather than of the
full tree.
To properly constrain the analogizer,
it is important that only the appropriate meaning of height be used --
that other definition could very well lead to a totally different analogy.
This points to the need to use something as precise 
as predicate calculus when describing each term 
-- the intended meaning can be conveyed unambiguously, as it was
in the definition given in @Ref(Height-Defn).

The final point is a reiteration of how essential the context is to
finding the best analogy.
A different context may very well lead to a different analogy.
For example, in the context of "leaves", one might find other mappings
which rely on the fact that leaves act as "factories",
supplying the energy to the tree
-- and so may end up metaphorically as "generating" the tree,
which we might say some style of sorting algorithm does.
In yet another context, the demons which can live in data structures
in general might
be matched with those resident squirrels, and that correspondence would
constitute the analogy.
[end of contrived examples]

@Subsubsection(Heuristics and Facts)
The analogizer has access to three other inputs --
two types of FACTS knowledge bases, together with the Heuristics KB.  
While all of these are adaptable (from user to user),
for any given example we will assume they are relatively static.
We will say more about these later, in Subsection @Ref(KB-Stuff).

@Section(Use of this answer)
@Label(UseOfAnswer)
Now that we have this analogy,
what can one do with it?
By construction it addresses the issue mentioned in the context.
(In the case of the first type of context, this tells us what it means
to discuss the height of a CS-Tree.)
But it may have other uses as well.
Notice that much of the (structure-related) vocabulary about CS-trees is
derived from N-Trees,
including the terms leaves, branchiness, forests, and root.
We show below how easy these are to understand,
given the correspondences defined by this common hierarchy.

The answer given to the "CS-tree to N-Tree in the context of Height"
question implicitly defines a mapping, attained by
associating the corresponding elements on the alists returned.
(These can be found by lining up the columns in Figure @Ref(Answer-Fig).)
For example,
using the fact that the R2 relation maps to SproutsInto in the N-Tree case,
and into LinkedTo for CS-trees,
we may associate SproutsInto with LinkedTo.
Similarly the set of Branches should correspond to the Nodes.
Note that this pair of correspondences is consistent 
(@i{i.e.}, defines a homomorphism),
in that the domain of SproutsInto (Branch @i(x) Branch) does indeed match the
domain of LinkedTo -- Node @i(x) Node.

Anyway, consider now what the computer scientist should mean by the term "leaf",
in the context of CS-trees.
We know from N-Trees that
@BEGIN[EQUATION]
	@Z(A)x. (N-Leaf x) @Z(g) ((Branch x) & ¬(@Z(E)y. (SproutsInto x y))).
@END[EQUATION]
Using the mapping defined above, it seems natural to define the CS-Leaf relation as
@BEGIN[EQUATION]
	@Z(A)x. (CS-Leaf x) @Z(g) ((Node x) & ¬(@Z(E)y. (LinkedTo x y))),
@END[EQUATION]
which, lo and behold, is correct.
(Note this does not even consider that silly sounding "leaf as factory" mapping
mentioned above -- only the structural properties of leaves were considered here.)
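The substitution just performed is mechanical enough to sketch in code. The fragment below is a toy illustration: the nested-tuple encoding of definitions is our own choice made here for clarity, not the system's representation, and the symbol map is simply read off the aligned alists.

```python
# A toy sketch of carrying a definition across the analogy.  The symbol
# correspondences come from the aligned alists; definitions are encoded
# as nested tuples (a representation chosen here for illustration only).

SYMBOL_MAP = {            # correspondences from the aligned alists
    "N-Leaf": "CS-Leaf",
    "Branch": "Node",
    "SproutsInto": "LinkedTo",
}

def carry_over(sentence, mapping):
    """Rewrite every mapped symbol in a nested-tuple sentence."""
    if isinstance(sentence, tuple):
        return tuple(carry_over(part, mapping) for part in sentence)
    return mapping.get(sentence, sentence)

# (forall x) (N-Leaf x) <=> (Branch x) & not (exists y) (SproutsInto x y)
n_leaf_defn = ("forall", "x",
               ("iff", ("N-Leaf", "x"),
                       ("and", ("Branch", "x"),
                               ("not", ("exists", "y",
                                        ("SproutsInto", "x", "y"))))))

cs_leaf_defn = carry_over(n_leaf_defn, SYMBOL_MAP)
```

The result is exactly the CS-Leaf definition given above: only the symbols change, while the logical structure of the sentence is preserved.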

Of course, such definitions do not always carry over so nicely.
And there will be many things which do not carry over at all,
at least not by this connection in this context
-- such as bark, squirrels, and fruit.
(At the end of Subsection @Ref(Context) we saw that these aspects
(such as squirrels) may in fact be carried over from N-Trees to CS-Trees,
in the appropriate context.)

@Chapter(Salient Features of the Analogizer)
@Label(Salient)
So much for the particulars of this single example.   
Let's now reflect on the nature of the analogizer's task,
and review many of its more salient features.
This chapter will discuss the behaviour of the Analogizer.

@Section(Dynamic Input)
One input to the analogizer is a pair of proposed analogues.
Each may in fact represent an object in the world, or an event,
a situation, or whatever.
Ideally these would be the actual models of the phenomena.
Unfortunately it is rather difficult to hand a computer program
a biological tree; this forces us into the equally impossible task
of encoding a @i(complete) description of such things.@FOOT{
It is valid to raise a strong objection here -- that whatever encoding
we chose will necessarily omit some of the "essence of tree-ness" we hoped
to convey.
Stronger yet, the facts input to the program will necessarily 
not completely describe the object.
Unfortunately there is no solution to this dilemma.
However we should realize that our own perceptual system is similarly
imperfect --
for example, 
the visual data we "see" has already undergone an incredible amount of
processing, filtering, and re-orientation before we "consciously"
perceive the image.  (Proof: Consider the Necker cube, or Duck-Rabbit,
or almost any of @Cite[Gregory]'s work.)
It is this top-down interpretation, which biases even the low-level
input,
which deprives our visual system of its "innocence" (see @Cite[Hintikka]).
Of course the same is true of our other senses.}
Hence we are forced to represent each analogue using some formalism.
For the non-trivial sort of analogies we hope to discuss -- those
which do not succumb to simple pattern-matching approaches
but rely instead on some deeper connection --
it makes sense to use something as expressive as predicate calculus.
(This is, in particular, in opposition to the Unit-Slot-Value approach
which, while "epistemologically equivalent", 
proves a considerable hindrance when dealing with non-binary relations,
or meta-level assertions.)

The other input which varies with each proposed analogy
is a Context description.
Its mission is to (somehow) convey the current situation to the analogizer.
This "guides" the analogizer towards those types of analogy
which are pertinent to this situation,
and away from others.

@Section(Static Input)
@Label(KB-Stuff)
The analogizer also has access to three Knowledge Bases --
containing the analogy-defining heuristics, facts needed for meta-level
reasoning, and domain facts.
While the user is permitted to alter any of these 
(using the KB-Modifier module), all three of these are relatively "static".
That is, after the user has entered these KBs,
they will (probably) remain constant over the analogies this user will request.

The rules in the Heuristics KB are the real "smarts" of the entire system.
The previous example shows how they will be used to determine how to
@BEGIN(Itemize)
generate a common partial theory.

conclude this partial theory is enough (@i{i.e.}, termination conditions).

use the context provided when constructing (or evaluating) that theory.
@END(Itemize)
Other heuristics will be used to provide 
@Begin(Itemize)
a set of criteria used when comparing two proposed analogies.@*
(This particular task did not arise in the "Explain X" situation.
It is quite important, however, in "proportional metaphors",
@i<A:B :: C:? for ? @Z(B) 1,2,3,4,5>.
See @Cite(Paivio).)@*
Example: weight binary "structural" relations more heavily than unary "features"
(taken from @Cite(Gentner)), or consider role-related attributes before
physical properties (from @Cite(Carbonella)).

criteria for evaluating an analogy in general -- to determine whether a
suggested analogy is acceptable.  (This is quite similar to the above use.)

meta-level rules, to enforce principles which, empirically,
have proven to work quite well.@*
Example: apply domain-specific heuristics before the more general ones.
@End(Itemize)
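One of the comparison criteria listed above -- weighting structural relations over unary features -- might be sketched as follows. The weights and relation names here are invented for illustration; the actual heuristic would of course be one of the user-alterable rules in the Heuristics KB.

```python
# A hedged sketch of one comparison criterion: weight shared binary
# "structural" relations more heavily than shared unary "features" when
# scoring a proposed analogy.  All weights and names are invented.

STRUCTURAL_WEIGHT = 3   # relations of arity >= 2
FEATURE_WEIGHT = 1      # unary attributes

def score_analogy(shared_relations):
    """shared_relations: (name, arity) pairs in the common partial theory."""
    return sum(STRUCTURAL_WEIGHT if arity >= 2 else FEATURE_WEIGHT
               for _name, arity in shared_relations)

# An analogy sharing two structural relations outscores one sharing
# two superficial features.
structural_score = score_analogy([("SproutsInto", 2), ("PartOf", 2)])
featural_score = score_analogy([("Green", 1), ("Tall", 1)])
```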

We now consider the other things the analogizer must know about before
it can perform any deductions.
These facts fall into two broad categories --
Domain Facts and Meta-Facts.

The first collection of facts, housed in the Domain FACTS KB,
describes the facts pertinent to the domain(s) of the analogy.
In the example given in Section @Ref(Intro),
we had to know various things about biological
trees (and possibly things about roots, or squirrels, vegetation, @i(et cetera)),
as well as facts about the field of computer science, especially regarding
data structures, sorting routines, and the like.
All such facts will be stored here.  
The analogizer knows to look up any domain term in this reference bank
-- hence the actual inquiry needs only use the term N-Tree to 
conjure up the entire theory of N-Trees.
There is very little we can say about this part of the knowledge base --
both its structure and its contents are input by the user, as needed
for his examples.@FOOT{
This thesis is NOT concerned with the mechanics of how these updates are 
performed -- or even when 
(whether only at the start of a session, or whenever needed,
as the user enters a novel term).  For the duration of
this proposal we will make the simplifying assumption that this type of
information will be available as the analogizer needs it, in the best
possible form; period.}

The other body of facts, in the Meta-Facts KB,
deals with the domains of predicate calculus statements and inferencing
in general.@Foot{
Yes, it is slightly misnamed.  I just got tired of the proliferation of
KBs, and decided no harm will come of merging the "Second-Order Facts" KB
with the needed Meta-Facts one.  We'll see.}
It contains two types of information.
The first partition deals with relations.
Here is a list of the types of properties a relation can have --
both purely syntactic ones, like arity, and more semantic
features, such as symmetry or transitivity.
This category also includes "methods" for generating
new types of relations from existing ones --
for example, using the TransitiveClosure function to derive LeadsTo from LinkedTo.
The other category handles modes of inferencing, both of this analogizer and of
other "reasoner"s.
Included here are ways to use knowledge to speed up deductions,
and ways of "simulating" an inferencing process, to model another's logic 
(given his body of knowledge and reasoning skills).
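The TransitiveClosure "method" mentioned above is easy to sketch. Assuming a relation is represented as a set of ordered pairs (an encoding of our own choosing, for illustration), deriving LeadsTo from LinkedTo might look like:

```python
# A minimal sketch of the TransitiveClosure method: derive LeadsTo from
# LinkedTo by closing a set of ordered pairs under transitivity.  The
# set-of-pairs representation is our own choice, not the system's.

def transitive_closure(pairs):
    """Smallest transitive relation containing the given pairs."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

linked_to = {("root", "n1"), ("n1", "n2"), ("n2", "leaf")}
leads_to = transitive_closure(linked_to)    # adds the derived links
```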

During the first phase of this thesis work we will build
the set of meta-level facts required to operate the analogizer
effectively.  We will defer further discussion of its contents until
after this stage.

A separate, very important module of this overall system is the KB-Modifier.
Its function is to help the user modify those collections of facts and heuristics.
The same mechanism used to alter Facts will be available for the heuristics.
In that latter case,
the KB-Modifier provides
tools which facilitate both the generation and incorporation
of new heuristics (or even new types of heuristics), and the modification
of existing rules.
The thesis itself will discuss this at length; this proposal will not.

@Section(Output)
The answer to such a request-for-analogy inquiry is a triple,
consisting of a partial theory,
together with a pair of mappings which demonstrate how the two inputs
each satisfy that partial theory.
Note that different members of this trio will be relevant
for different applications.

We saw that this pair of mappings from theory to (respective)
analogues can be used to define
a mapping connecting certain subparts of one analogue to the other.
This mapping can first be used to connect elements in one domain to those
in the other -- as Root maps into Trunk.
It can also be used to associate arbitrary predicate
symbols in one theory with corresponding symbols in the other --
for example, the SproutsInto relation, for N-Trees,
mapped into the CS-tree relation LinkedTo.@FOOT{
The (special) domain element to domain element mapping is
really a subcase of the general mapping shown above,
as these domain elements are essentially constants,
and therefore can be viewed as zero-ary relations.}
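The composition just described can be sketched directly. The variable and symbol names below are hypothetical: given the two theory-to-analogue mappings from the answer triple, pairing the symbols bound to the same theory variable yields the analogue-to-analogue mapping.

```python
# A sketch (all names hypothetical) of turning the answer triple's two
# mappings into a single analogue-to-analogue mapping: pair the symbols
# bound to the same variable of the common partial theory.

theory_to_ntree = {"R2": "SproutsInto", "S1": "Branch", "c1": "Trunk"}
theory_to_cstree = {"R2": "LinkedTo", "S1": "Node", "c1": "Root"}

def cross_mapping(to_first, to_second):
    """Map each first-analogue symbol to its second-analogue counterpart."""
    return {to_first[var]: to_second[var]
            for var in to_first if var in to_second}

ntree_to_cstree = cross_mapping(theory_to_ntree, theory_to_cstree)
```

Both relation symbols (SproutsInto/LinkedTo) and constants (Trunk/Root) are handled uniformly, since constants are just zero-ary relations.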

@Section(Complexity)
Now that the I/O properties of this analogizer have been specified,
what can be said of its complexity?
Clearly its task would have been much easier
(even trivial) if it knew beforehand about hierarchies,
and had @i(a priori) defined these trees in terms of hierarchies,
as in
@BEGIN[EQUATION]
Satisfies( CS-Tree, Hierarchy( LinkedTo, Nodes, Root ) ) 
@TAG(Use-Hier)
Satisfies( N-Tree,  Hierarchy( SproutsInto, Branches, Trunk ) ).@FOOT{
The astute reader will note that, given the naivete of the analogizer,
even this powerful information, by itself, will not guarantee any results.
That would require a heuristic such as
@BEGIN(EQUATION)
Seriously consider the Satisfies relation,
	(especially when it deals with a complicated or useful theory).
@END(EQUATION)
as well as a domain fact like
@BEGIN(EQUATION)
The theory of Hierarchies is both complicated and useful.
@END(EQUATION)}
@END[EQUATION]

In fact, this description does seem reasonable in this simplistic example.
However this assumption will be unwarranted in (most) other examples.
That is, the commonality which joins a pair of objects 
will seldom be well known
(@i{i.e.}, among the set of things one would expect another person to know about).

Of course, if one already happened to have such a category,
it should by all means be used.
However this approach should be applicable in other situations
as well, where that commonality is NOT as explicit.
Once again we see that knowledge can do much to constrain search.
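The shortcut discussed here -- consulting explicitly recorded commonalities before searching for a common partial theory from scratch -- can be sketched as follows. The Satisfies facts shown are invented for illustration.

```python
# A sketch of the shortcut above: before searching for a common partial
# theory, check whether the Domain Facts KB already records a theory
# both analogues Satisfy.  The facts below are invented.

SATISFIES = {
    "CS-Tree": {"Hierarchy", "Graph"},
    "N-Tree":  {"Hierarchy", "LivingThing"},
    "List":    {"Sequence"},
}

def known_common_theories(a, b, satisfies=SATISFIES):
    """Theories the KB already knows both analogues satisfy (may be empty)."""
    return satisfies.get(a, set()) & satisfies.get(b, set())

shared = known_common_theories("CS-Tree", "N-Tree")   # cheap lookup first
```

When this lookup comes back empty -- as it will for most interesting pairs -- the analogizer must fall back on its heuristic search.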

@Chapter(Scope of Analogizer)
@Label<Scope>
Our initial (grandiose) goal will cover several other cases of analogy,
each with a different type of "Inquiry".

@Section(Further examples)
@Label(Examples)
Given this definition, we can now consider instances of analogy.
We are designing this system to be able to handle each of
the following queries:

@BEGIN[ENUM1]
Explain the computer science use of the term "tree".

What is a bachelor bear?

Why is learning at CalTech like sipping water from a fire hydrant?

@BEGIN[Multiple]
a. ewe : cow :: lamb : ?@*
@TABDIVIDE(6)
@\1. bull@\2. calf@\3. bovine@\4. puppy@\5. child

b. language : phoneme :: music : ?@*
@\1. interval@\2. pitch@\3. timbre@\4. melody@\5. harmony
@END[Multiple]

Who is the first lady of England?

(Comparing plays) Is Romeo & Juliet more like Hamlet, or West Side Story?

Given a theorem, X, about a certain type of groups, formulate a new theory,
X', which holds for a "corresponding" type of rings.@*
<<Here -- maybe sphere -> circle, or triangle -> tetrahedron>>@*
<<Or maybe go from PROOF of X to PROOF of X' - ala Carbonell?>>

How is the American economy like a rubber band?

View the heating mechanism of a blow drier as a toaster-like device.
@END[ENUM1]

Notice each of the above examples/queries has a different type of answer:

@BEGIN[ENUM1, LeftMargin = +5, Indent = -5, Numbered = <#@1 >]
was discussed at length above.
It requires understanding how the meaning of a term can be extended to a
new domain (here computer science from botany),
based on observing some common attribute (here the underlying structure).

is similar -- how the meaning of bachelor can be stretched to apply to bears.
(Realize, of course, that bachelor is really only defined for humans...)

again demonstrates the sharing of some common characteristic (here the
attempt to ingest a small amount of some substance being forced out).

@BEGIN(Multiple)
a. asks which of the five terms relates to lamb in the same way cow does to ewe;

b. has a similar goal, but here the "answer" is not (as) well defined,
@END(Multiple)

is rather ambiguous -- see @Cite[Hof81] for a list of some appropriate
answers.

seems a simple comparison.  However it becomes quite clouded when you realize
one has no @i(a priori) basis of comparison -- should it be in terms of
number of characters (either in the story itself, or in the name
of the play), total number of murders, author, or ... (Yes, it is true none
of the other examples pre-define the axis to be used for the comparison
either.  We included this example because its dependency on
context is so apparent.)

is a traditional form of analogy (see @Cite[Kling]) which requires the
recognition that certain properties and relationships carry over isomorphically from
one field to (a subset of) another.

was thrown in to demonstrate people's incredible ability: it shows that one can
pretty readily invent some mapping to explain almost any "How is X like Y"
question -- in fact, several.

shows that analogies may be "representational" as well as "linguistic".
It may be useful to represent a blow-drier as a composite of many
pieces, one of which is like a toaster, another like a fan, another ...
A good description of how the blow-drier works can be attained
by fitting together these partial descriptions.
(See @Cite[Dietterich].)

@End[ENUM1]

It is beyond the scope of this proposal to discuss these other examples,
even to the sparse detail given for the first tree case.
It will be possible to ask each of these to the analogizer.
The format of the question, and the answer given, will appear in a followup
report, currently in preparation.

@Section(Limitation of Scope)
Diverse as these examples seem,
they all take (essentially) a pair of proposed analogues
and return a partial theory which both models satisfy.
There are other forms of analogy.
One example is the inquiry
"What is like X", as is "Given Analogue@-(1), (some
hint of) a common theory, and a new domain, D, find the best Analogue@-(2),
in D".@Foot{
For example, we might give a description of CS-Tree (omitting its give-away
name) for the first analogue, something like <@i(Hierarchy)> for the common
partial theory,
and the domain of physical objects
(a superclass of biological things); and ask for the best match,
expecting, of course, that N-Tree be returned.}
Another analogy-related question is "How is X NOT like Y?".
@Comment(Darwin wondered how natural selection differed from domestic selection.)
Eventually we may address these types of questions, but not yet.

@Chapter(Research Programme)
@Label(Programme)
Everything up to this point in this proposal has had the goal of convincing
the reader that this task is indeed feasible
-- that it is possible to
build an intelligent, general analogizer; and that the internal heuristics
needed to guide it were fairly easy to generate.
The rest of this report will outline the author's intended research programme,
to culminate in the implementation of that computer program,
capable of generating and using analogies to answer questions.

@Section[Goals of the system]
@Label[Goals]
For any project, and especially one which centers around a term
used as commonly as analogy,
it is necessary to spell out what the research objectives really
are, and are not.
Below we list (our current aspirations of) the objectives we hope to achieve:

@BEGIN[Enumerate]
To empirically verify the utility of our model of analogy:
that two models are analogous if they share a common partial theory. 
(We will examine a wide range of analogies,
showing that each such instance can be explained within this definition.)
@Comment <This will be basically a dry lab.  Is there any way, Mike?>

To experimentally test the theory that a 
relatively small body of heuristics can effectively guide a
weak, general program to find acceptable analogies.
(By acceptable we mean the deduced analogy will be meaningful to the person
who contributed those heuristics.)

The implementation of that general analogizing program, 
which is capable of deducing analogies between a pair of things based on
some particular set of heuristics.

A formal description of the "space" of those analogy-specifying heuristics.
This descriptive language will be sufficient
to allow any new user to specify his particular heuristics.
In addition, we will implement a program which "compiles" such specifications
into functional heuristics, which are then used by the general analogizing
program.

@END[Enumerate]

Note here what was NOT mentioned above:
We are NOT attempting to model either human behaviour,
or the human understanding process;
nor is our goal to provide a definitive, final description for these
loaded terms.
We are neither claiming any psychological credibility to these definitions,
nor suggesting that they match any criteria given in any philosophical treatise.
Worthwhile as these goals are, ours is quite different.
@Section[Programme steps]
The various steps are listed below, in rough chronological order.
Note some tasks may be started before the preceding one has been finished.

@Subsubsection[Step 1]
Our first goal is to define what we mean by analogy,
and classify its many different senses.
This will require enumerating many types of tasks which seem to
require some analogizing ability, and in each
case determining what types of questions this capability tries to answer.
(The examples given in Section @REF(Examples) constitute my initial stab at this
categorization.)
An offshoot of this study will be some preliminary ideas about "naive analogy" --
that is, a weak, but useful, description of analogy, @i(a la) Hayes' naive physics
(see @Cite(NaivePhysics).)
During this phase we will restrict the goal of the eventual program
to a subset of these many uses, and decide in which domains the program will work.

@Subsubsection[Step 2]
(This will actually be performed in parallel with Step 1 above.)
We will collect a large, and hopefully diverse, corpus of examples of analogies,
taken both from common usages and from various pieces of scientific literature,
(including both philosophical discussions of analogy,
as well as "domain" texts,
in which the author uses some deductive process which appears analogical.)

At this time we will show a number of colleagues this list,
and ask each to begin generating their own cluster of,
say, ten different analogies.
Several months later,
during Step 6 of this programme,
they would reveal this previously-secret (and probably unformulated)
analogy-ettes.

Each of these examples will be a small, local instance of an analogy,
similar to @Cite(Miller)'s "local metaphors".
We are intentionally avoiding analogies which require vast amounts of domain
knowledge at this stage.
Our goal is to describe analogical processing, and not
any particular domain; and the overhead required to enter a non-trivial
body of domain data will considerably slow down the real research part of this task.

Unfortunately, the pitfalls of this approach are equally apparent:
These examples will be meaningless if we "cheat" --
by encoding the facts in some contrived manner,
one which "gives away" the analogy.
For this reason, this first pass set of examples will NOT be the sole basis
for evaluating this approach.

Note this "cheating" concern is further addressed in Step 7,
when a large and more completely specified domain is used.
By taking some pre-existing collection of facts,
designed to solve some actual domain problems,
we are spared the argument that we specifically configured the data
to give away the analogy.

@Subsubsection[Step 3]
We will then analyze these data points, trying to tease out
the relevant heuristics.  This will rely heavily on introspection --
by asking how I could have found these connections,
and why only these, and not other (sillier) ones?

<<Doug - here is one place your expertise will be particularly welcome --
as an authority on heuristic-izing...>>

During this interval I will also attempt to read through much of the
philosophical literature, hoping to glean relevant rules and thoughts.
Among the readings will be @Cite[M&T], @Cite[Polya1]/@Cite[Polya2],
@Cite[Gentner], @Cite[Hesse] and @Cite[Lakoff].
<<Lindley - perhaps you could augment this collection of must-reads --
including your own recent thoughts as well?  Thanks.>>
Various AI works will be reread as well -- both those containing the word 
"analogy" in their title, including 
@Cite[Kling], @Cite[Evans], @Cite[Ana],
and @Cite[Carbonella]/@Cite[Carbonellb],
and those in related fields, such as Learning 
(including @Cite[Learn-HR], @Cite(AM), and ?)
and problem reformulation -- such as @Cite[Amarel] and @Cite[Tappel].

@Subsubsection[Step 4]
All the above was a (necessary) preliminary dry-lab.  
Now to begin storing the information on a computer.  
The first decision is how to encode these rules.
Of course (at this stage) the syntax of the rules,
as well as the representations on which they operate,
will be heavily dependent on these particular initial examples we used.@FOOT{
In the long run, we anticipate being able to go from these 
specific rules to the more fundamental and "abstract"
(@i{i.e.}, nearly domain independent) underlying heuristics.
Anyway, the existing rules (on both levels) will be written
with the additional goal of (later) understanding these rules in terms of the more
general ones
-- @i{i.e.}, we will attempt to formalize the 
types of "specifying transformations" which take (unusable) abstract rules
into applicable, low-level rules.}
Three more parts of the neo-natal system are required:
First the "interpreter" must be written --
this is the actual program which, given a body of heuristics, is able to perform
(a simplified version of) the actual analogizing task. 
The other parts are designing and building the other KBs -- of Meta-Facts and
relevant Domain Facts (for the domains to be used in this test).

@Subsubsection[Step 5]
(Done in parallel with Step 4.)
During the previous step, we will note convenient ways of entering these rules --
and how to encode the representation itself, to facilitate such
future input.  This will be used in the second, major part of this thesis work.

@B[*-*-* Benchmark *-*-*]@*
There will now be a running system, which uses a (background)
Knowledge Base of heuristics, to find analogies, or rate (evaluate) a pair
of analogies, etc.
It should be quite easy to alter 
these heuristics themselves,
using representation-independent tools.

@Subsubsection[Step 6]
At this point, we will test our (implicit) claims of generality and sufficiency:
We will see if these rules and (proto-)analogizer "work" on that
heretofore hidden second list of analogies.
(A subset of) the relevant questions here are:
@BEGIN[Enumerate, Spread=0, Numbered=<(@i) >, Referenced <(@i)>]
Was the current system able to discover the appropriate analogies?
@TAG(Apt-Analogy)

If not, was it easy to add in the new set of heuristics?
@TAG(Add-New)

Should these new heuristics be deducible from the more general ones?
@TAG(Deduce-New)

How should that body of general heuristics be changed/augmented to handle
these cases?
@TAG(Major-Change)

Was the overall interpreter sufficient for this task?
@TAG(Interpreter)
@END[Enumerate]

It is not clear what the desired answers to these questions are.
I expect the interpreter will be sufficiently simple and straightforward that
it can now be pretty much frozen -- finding that
@Ref(Interpreter) is True supports this position.
As analogies are quite subjective,
@Ref(Apt-Analogy) may be False in several cases.
This is the @i(raison d'etre) for
the KB-Modifying module shown in Figure @Ref(Components).
If this works as well as we hope,
@Ref(Add-New) should be true
-- changing these heuristics should be trivial.

<<Perhaps I should adopt something like DEW's 20 minute criteria?>>

While it would be nice if 
@Ref(Deduce-New) was true at this point, that is quite unlikely.
(Indeed, without knowing the "shape" of the space of heuristics,
there is no reason to suppose that it will ever be true -- 
there may not ever be such a spanning set of general heuristics...)
Anyway, adding new rules and changing existing ones, 
[during the process implied by
@Ref(Major-Change)]
will serve to test the rule-modification part of the system.

On a related topic, we may need additional types of second-order or meta-level
assertions to do this additional processing.  These too will have to be
added.  We feel the same basic KB-Modifier that was capable of twiddling
the heuristics will be able to perform this task.

@Subsubsection[Step 7]
Now onto the big stuff: If the small, "toyish" examples above have
worked, we will then start entering new domains (each along with a description
of its notation/syntax/representation...).
The goal will now be to find useful,
relevant "natural" connections between pairs of them.  Maybe by
then some of the domains will have sufficiently rich causal
models that significant new observations/discoveries can now
be made. Maybe...

The domain facts should be tapped directly from the "real system",
with a minimum number of alterations --
the smarts should be in those heuristics, not in the 
particular, fortuitous, pre-selected representation.

One "real" task would be to expand an existing knowledge base to include a new
class of objects.  
For example, the current Molgen (@Cite(Molgen), @Cite(Units)) KB has a variety
of facts which deal with restriction enzymes.  
Imagine now we wanted to add in a new class of (non-restriction) enzymes
to this knowledge base.
While this is conceptually quite straightforward,
it nonetheless requires a great amount of work:
each rule has to be examined,
and if it actually deals with restriction enzymes,
a new analogous rule has to be produced to handle
(the similar case for) those new enzymes.
When posed this way, this search and modify task becomes an
obvious analogy exercise:
find the <NewRule#N> which satisfies the relation
RestrictionEnzyme : Rule#N :: NewClassOfEnzymes : <NewRule#N>.
This particular task has a direct "success criterion":
does MOLGEN now do the correct thing with these new enzymes?
(I am indebted to Professor Buchanan for proposing this application.)
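The search-and-modify shape of this exercise can be sketched crudely. In the fragment below the rule format and enzyme names are invented, and naive textual substitution stands in for the real analogizing step; only the scan-and-produce-analogue structure is being illustrated.

```python
# A crude sketch of the Molgen-style expansion task: scan each rule, and
# for any that mentions the old enzyme class, produce the analogous rule
# for the new class.  Rule format and names are invented; simple string
# substitution is a placeholder for the real analogizing step.

RULES = [
    "IF RestrictionEnzyme(e) AND Recognizes(e, site) THEN CanCut(e, site)",
    "IF Polymerase(p) THEN Extends(p)",
]

def analogize_rules(rules, old_class, new_class):
    """For each rule mentioning old_class, produce the analogous rule."""
    return [rule.replace(old_class, new_class)
            for rule in rules if old_class in rule]

new_rules = analogize_rules(RULES, "RestrictionEnzyme", "Ligase")
```

Rules which never mention the old class (the Polymerase rule here) are correctly left alone; only the rules that "actually deal with restriction enzymes" spawn analogues.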
@Section[Evaluation/Validation]

What are the criteria by which this program can be evaluated? 
The purpose of this programme is to test (and hopefully to verify) the
four hypotheses mentioned in Subsection @Ref[Goals].
As with all (reputable) AI programs,
the actual proof will be in the form of a running system.
What must this system be able to do for this program(me) to be judged successful?
Below we list some desirable results:

@BEGIN[Enumerate]
Certainly the system should be able to derive the appropriate analogies
for all of the initial examples.  
This will not be too surprising --
all of the initial work was geared towards solving these particular problems.
Failure here will imply either something is intrinsically and fundamentally
wrong with the model of analogies we are using;
or that the sheer size and scope of the task is overwhelming.
In either case, see the then-in-press ThesisProposal#2, by [Greiner >81].

@BEGIN[Multiple]
It would be nice if the system could generate
satisfactory solutions to all members of that second set of analogies,
on either the first attempt, or after adding in a few more rules.
Of course, this simply begs the question: what constitutes "satisfactory"?
For this instance, it would be sufficient if the analogies were acceptable
to the authors, as it was their heuristics which were used.

Hence this particular "benchmark" is fairly vacuous.
The following proposed criteria fortunately do not share this limitation.
@END[Multiple]

@BEGIN[Multiple]
We will consider this system successful if it
can be geared up to "really" handle non-trivial examples,
as demonstrated by successful performance of the type mentioned in Step 7.
Here the system's generality will be verified (well, suggested)
by the ease with which these diverse new domains were added.
This criterion for ease will be only slightly subjective.
We will later specify some standard -- perhaps that a competent, if
new, user can enter the new heuristics in less than X amount of time,
after the minor recoding of the domain data, and a few trial runs.

Notice this will address another pair of major concerns we now have:
@BEGIN[ITEMIZE]
Will this scale up?  Just as the best way to reach the moon is NOT by
building bigger ladders, so it may be possible that approaches which
work well for small examples may fail when attacking a larger problem.
The problem may be due to some type of combinatorial explosion,
or to some deeper problem with the conceptual framework used.
For example, we may have modularized this problem in a totally
inappropriate manner.  While benign for small cases, this problem
will present itself when any real analogy is addressed.

@BEGIN(Multiple)
How dependent is the system on the particulars of a given representation?
This issue is closely related to the previous point, 
both in that it may only be problematic for non-trivial exercises, and
in being very important when discussing systems which
claim to solve a general problem in a general way.  
Given how easy it is to encode significant parts of the answer
just by the formalism used,
there is serious doubt about the generality of any mechanism
which can only solve those problems presented using a particular representation.

The following two examples should illustrate what we mean by representation,
and the kind of pitfall we are trying to avoid.
@BEGIN(Itemize)
Many AI representation languages include a built-in inheritance scheme.
Notice this means "sibling" relations can be found for almost nothing --
one simply finds another child of the same parent.
While this relation is indeed quite useful,
it accounts for only one type of analogy.
In fact, there is the risk that it may detract from finding other 
(more correct) mappings.
It also forces every item to be entered into this hierarchy,
independent of whether this categorization is essential (or even relevant).

Another more pointed example arises from the
Conceptual Dependency graphs many cognitive scientists insist on using.
This view of the world is very helpful in many situations where 
human actors perform in expected, stereotyped ways 
-- but can be very misleading in other situations.
@COMMENT<when dealing with mechanical objects 
(did the teeter-totter "want" to tilt the
other way, and thereby cause harm to befall Freddie, who was rudely dumped off)>
Trying to squeeze every sequence of actions and causations into this model
can be not only time consuming but unproductive as well.
For example, its notation may make it quite difficult to avoid
assigning volition to mechanical objects, as they appear analogous to
other "spirited" actors.
@END(Itemize)
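The point about inheritance schemes can be made concrete.  The following
minimal sketch (with an invented toy hierarchy) shows why "sibling" relations
come almost for free once a built-in isa-hierarchy exists: finding another
child of the same parent is a single lookup plus a scan.

```python
# Toy isa-hierarchy: child -> parent.  The data is invented for illustration.
parent = {
    "Sparrow": "Bird",
    "Robin":   "Bird",
    "Bat":     "Mammal",
    "Whale":   "Mammal",
}

def siblings(node):
    """All other children of node's parent -- the 'free' analogy an
    inheritance scheme provides."""
    p = parent.get(node)
    return sorted(c for c, q in parent.items() if q == p and c != node)

print(siblings("Sparrow"))   # Robin is the other child of "Bird"
```

Cheap as this is, it yields only the one kind of mapping; any analogy that
cuts across the hierarchy (a bat flying like a sparrow, say) is invisible to it.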

Any system which insists on some particular set of vocabulary will be
useless for other problems.
We are designing the analogizer with an eye towards generality --
its internals will include NO pre-defined, specially handled
terms, relations, @i(et cetera).
We will be quite disappointed to find any such bias has crept in.
Feeding this system with examples generated for other purposes,
by other people, should supply the varied selection of
diverse formalisms we need to really test this claim of
representation-independence.
@END(Multiple)
@END[ITEMIZE]

Even now this program may still be subject to the accusation that it only
handles computer related analogy -- and that our model is invalid outside
this limited scope.  Hence the next test:
@END[Multiple]

Evaluation by "man in the street" experiment:
Almost anyone can understand almost any analogy, either immediately or after
some quick explanation (@i(a la) "think of them both as ...", or ...).
Similarly, most people can complete sentences like
"John is like a bird because ...".
While the goal of this system is NOT to match people's amazing
facility with analogy,
this program would be an (unqualified) success if it attained this
level of competence:  if it could understand anybody's favorite
analogy, after that person had added in his set of heuristics;
especially if this "learning" phase was quick and painless.
<Next level: to understand person X's analogy by (meta)analogy to other
analogy he had generated.  I'll not worry about that just yet.>
@END[Enumerate]

As the actual program is designed, it will be clear which
modules were responsible for which parts of these overall goals.
At that point we can address the "credit assignment" problem, to consider
what it means if any of the above considerations
@Ref(Apt-Analogy) - @Ref(Major-Change) were to fail.

@Section[Details]

A number of lower level declarations still
need to be pinned down.
This subsection gives starting values for these parameters.

@BEGIN[ITEMIZE]
Time scale @*
The overall time for this project, from thesis proposal through
final thesis writeup, should be a little over a year.
The full coding should be completed within the year --
by about June 1982.
Further details, tying a schedule to each of the steps, with milestones,
@i(et cetera), will be forthcoming.

Representation Language@*
Our task clearly requires a versatile, and extensible language.  For reasons
given in @Cite[RLL], there are very few languages which qualify.
The two which do are RLL and MRS.
Based on my (over)familiarity with RLL,
it will be the initial system used.
We reserve the right to eventually switch to MRS, if we feel
MRS has something to offer beyond RLL's capability.
(Perhaps its capability to use (meta-)description of various representations
will be the deciding factor.)

Actual domains to analogize@*
This is still a large question mark.  There are many reasons why this domain
must be "artificial", as we will have
to be able to perform experiments in this domain.
The description of Eurisko provides a more complete explanation for this.
Current proposals: Part of programs? Data structures? Games?

"Staff"@*
The bulk of the work will be done by the author (Russell Greiner),
with much advice from Professors Douglas Lenat and Michael Genesereth.
Professor Lindley Darden's continuing participation will be quite welcome.
Other possible members on the (proposed) reading committee are
Professors Bruce Buchanan and Terry Winograd, and
Doctors John Seely-Brown and Rick Hayes-Roth.
I hope to work closely with other students whose theses deal, in some way,
with things like analogy, including Paul Cohen, Tom Dietterich, and
Steve Tappel.

Other Collaboration@*
Artificial Intelligence researchers are certainly not the only people who
have investigated the nature of analogies and tried to understand their
role in cognition.
In this research effort, we will make every effort to use all the results
psychologists can proffer, as well as the suggestions stemming from the
questions philosophers have raised.
@END[Itemize]